Abstract: The past century has seen a steady increase in the need to estimate
and predict complex systems and to make (possibly critical) decisions with limited
information. Although computers have made possible the numerical evaluation of
sophisticated statistical models, these models are still designed by humans because
there is currently no known recipe or algorithm for dividing the design of a statistical
model into a sequence of arithmetic operations. To address this
problem, this program has developed (1) the foundations of a rigorous framework
for the scientific computation of optimal statistical estimators/models and (2) the
required calculus enabling the reduction of optimization problems over measures over
spaces of measures and functions. Two highlights of the work accomplished are
(1) the application of the calculus to the identification of brittleness in Bayesian
inference and (2) the application of the framework to the automated identification of
scalable linear solvers for PDEs with rough coefficients.
Summary of the work accomplished

Enabling computers to think as humans are able to when faced with uncertainty
is challenging in several major ways: (1) Finding optimal statistical models
remains to be formulated as a well-posed problem when information on the system
of interest is incomplete and comes in the form of a complex combination of sample
data, partial knowledge of constitutive relations and a limited description of the
distribution of input random variables. (2) The space of admissible scenarios, along with
the space of relevant information, assumptions, and/or beliefs, tends to be infinite
dimensional, whereas calculus on a computer is necessarily discrete and finite. This
program has laid down the foundations for addressing these challenges by developing
the required framework and calculus.

A framework for the scientific computation of optimal models/estimators.

The framework, described in [14], consists of the full incorporation of computation
and complexity into a natural generalization of Wald's Statistical Decision Function
framework [23, 24, 25, 26, 27] (based on a generalization of von Neumann's Theory of
Games [21, 22]). In this framework, optimal estimators/models are defined as optimal
solutions of (minimax) adversarial games in which player A chooses the real system in
an admissible set defined/constrained by available information, and player B chooses
the model/estimator, sees data generated by the real system and must predict some
quantity of interest that is a function of the real system.
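
As a toy illustration of the game just described, the following minimal sketch
enumerates a finite set of candidate systems (player A) and a finite grid of
deterministic estimators (player B), and selects the estimator that minimizes the
worst-case expected squared error. The coin-bias setup, the candidate grids and all
names are assumptions made here for illustration; this is not the construction of
[14], where the admissible set is typically infinite dimensional.

```python
from itertools import product

# Hypothetical toy setup: the "real system" is a coin with unknown bias p,
# known only to lie in an admissible set (player A's choices).
ADMISSIBLE_BIASES = [0.2, 0.5, 0.8]
N_FLIPS = 2  # player B observes this many flips


def risk(estimator, p):
    """Expected squared error of `estimator` when the true bias is p.

    `estimator` is a tuple mapping the observed number of heads
    (0..N_FLIPS) to a guess of p.
    """
    total = 0.0
    for flips in product([0, 1], repeat=N_FLIPS):
        heads = sum(flips)
        prob = p ** heads * (1 - p) ** (N_FLIPS - heads)
        total += prob * (estimator[heads] - p) ** 2
    return total


def worst_case_risk(estimator):
    """Player A picks the admissible system that hurts this estimator most."""
    return max(risk(estimator, p) for p in ADMISSIBLE_BIASES)


def minimax_estimator(grid):
    """Player B searches a finite grid of deterministic estimators."""
    candidates = product(grid, repeat=N_FLIPS + 1)
    return min(candidates, key=worst_case_risk)


if __name__ == "__main__":
    grid = [i / 10 for i in range(11)]
    best = minimax_estimator(grid)
    print("estimate per head count:", best)
    print("worst-case risk:", worst_case_risk(best))
```

Brute-force enumeration is only feasible because both players have finitely many
choices here; the reduction calculus discussed next is what makes the
infinite-dimensional version of this search tractable.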

A calculus for manipulating infinite dimensional information structures.

The resolution of these minimax problems requires, at an abstract level, searching in
the space of all possible functions of the data. By restricting models to the Bayesian
class, the complete class theorem [27, 1, 3, 19] allows this search to be limited to prior
distributions on the admissible set of candidates for the real system, i.e. to measures over
spaces of measures and functions. To enable the computation of optimal estimators,
this program has therefore identified conditions under which minimax problems over
measures over spaces of measures and functions can be reduced to the manipulation
of finite-dimensional objects, and developed the associated reduction calculus. For
min or max problems over measures over spaces of measures (and possibly functions),
this calculus can take the form of a reduction to a nesting of optimization problems
over measures (and possibly functions for the inner part) [16, 11, 17], which, in turn,
can be reduced to searches over extreme points [18, 20, 2, 13]. Specific applications
and developments of this calculus are as follows. [2] has presented sufficient
conditions under which an Optimal Uncertainty Quantification (OUQ, [18]) problem can
be reformulated as a finite-dimensional convex optimization problem, for which
efficient numerical solutions can be obtained. The sufficient conditions include that
the objective function is piecewise concave and the constraints are piecewise
convex. In particular, it has been shown that piecewise concave objective functions
may appear in applications where the objective is defined by the optimal value of a
parameterized linear program. These developments have been applied in [2] to
revenue maximization with stochastic supplies and to the optimal control of a power
network with stochastic demands. The more fundamental results of [13, 12, 15]
have laid down necessary steps for the identification of optimal reduced models on
complex infinite-dimensional spaces (such reductions are ubiquitous with DFT and
Navier-Stokes calculations). In particular, [13] has shown that, for the space of Borel
probability measures on a Borel subset of a Polish metric space, the extreme points
of the Prokhorov, Monge-Wasserstein and Kantorovich metric balls about a measure
whose support has at most n points consist of measures whose supports have at most
n + 2 points. Moreover, using the Strassen and Kantorovich-Rubinstein duality
theorems, [13] has developed efficiently computable supersets of the extreme points. [12]
has shown that, for a Gaussian measure on a separable Hilbert space, the
conditional measures associated with conditioning on a closed subspace are Gaussian,
with covariance operator given by the short of the covariance operator to the closed subspace.
[15] has demonstrated that a reproducing kernel Hilbert space of functions on a
separable absolute Borel space or an analytic subset of a Polish space is separable if it
possesses a Borel measurable feature map.
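
To fix ideas on the Gaussian conditioning result of [12], it may help to recall its
familiar finite-dimensional counterpart. The block notation below is an assumption
introduced here for illustration (it is not taken from [12]), and the block $C_{22}$
is assumed invertible.

```latex
X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim \mathcal{N}(0, C),
\qquad
C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}
\quad\Longrightarrow\quad
X_1 \mid X_2 = x_2 \;\sim\;
\mathcal{N}\!\left( C_{12} C_{22}^{-1} x_2,\;\; C_{11} - C_{12} C_{22}^{-1} C_{21} \right).
```

The conditional covariance is the Schur complement of $C_{22}$ in $C$. Shorted
operators generalize this Schur complement to settings where no such inverse is
available, which is presumably why the short appears in the Hilbert space statement.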

Bayesian Brittleness. In the process of its development, this calculus has been
applied to analyse the robustness of Bayesian inference under finite information
[16, 11, 17, 10]. This analysis has uncovered the possible extreme sensitivity
(brittleness) of Bayesian inference (in the TV and Prokhorov metrics or for Bayesian models
that exactly capture an arbitrarily large number of finite-dimensional marginals of
the data-generating distribution) and suggested that robust inference, in a
continuous world under finite information, should be done with reduced/coarse models rather
than highly sophisticated/complex models (with a level of coarseness/reduction
depending on the available finite information) [17]. More precisely, although Bayesian
methods are robust when the number of possible outcomes is finite or when only
a finite number of marginals of the data-generating distribution are unknown, they
appear to be generically brittle when applied to continuous systems (and their
discretizations) with finite information on the data-generating distribution.
Furthermore, if closeness is defined in terms of the total variation metric or the matching
of a finite system of generalized moments, then (1) two practitioners who use
arbitrarily close models and observe the same (possibly arbitrarily large amount of) data
may reach opposite conclusions; and (2) any given prior and model can be slightly
perturbed to achieve any desired posterior conclusions. The mechanism causing
brittleness/robustness suggests that learning and robustness are antagonistic requirements
and raises the question of a missing stability condition for using Bayesian inference
in a continuous world under finite information.

Automation of the process of scientific discovery. In the process of developing
this framework and calculus, this program has started addressing (as a direct
application of the framework and calculus) the fundamental question of whether scientific
discovery can be computed, i.e., can the process of scientific discovery be guided by,
or turned into, an algorithm? (In some sense this question is related to that of whether
machines can think.) This program has addressed three notoriously difficult examples
in which the answer to the above question is positive. The first one concerns the
identification of new Selberg integral formulae [11] (a notoriously difficult problem of pure
mathematics that has been turned into an algorithm). The second one concerns the
identification of accurate, localized bases for numerical homogenization/coarse
graining with optimal recovery properties [8] (a notoriously difficult problem of applied
mathematics that has been turned into an algorithm). The third one concerns
the identification of near-linear complexity linear numerical solvers [9] (a notoriously
difficult CSE problem that has been turned into an algorithm).

Gamblets. This latter example has led to the discovery of Gamblets [9] and
shown that the discovery/design of scalable numerical solvers can be addressed/automated
as a UQ problem by reformulating the process of computing with partial information
and limited resources as that of playing underlying hierarchies of adversarial
information games. As an illustration, [9] has shown how the application of the proposed
approach to the resolution of elliptic PDEs with rough coefficients leads to a
near-linear complexity multigrid/multiresolution method with rigorous a priori accuracy
and performance estimates. In this application, the numerical solver has been
discovered by identifying optimal strategies for gambling on the value of the solution of
the PDE based on hierarchies of nested measurements of its solution or source term.

Development of an efficient framework for heterogeneous computing and
robust optimization. This program has continued the development of (1) a
computational job management framework (pathos) (a parallel graph execution
framework providing a high-level programmatic interface to high-performance computing,
http://trac.mystic.cacr.caltech.edu/project/pathos, [5]) that offers a simple,
efficient, and consistent user experience in a variety of heterogeneous environments,
from multi-core workstations to networks of large-scale computer clusters, and (2) a
robust optimization framework (mystic) (a highly configurable optimization framework,
http://trac.mystic.cacr.caltech.edu/project/mystic, able to drive material
science code to fit structures [6, 4]) that incorporates the mathematical framework
described in [18, 7], and has provided an interface to prediction, certification, and
validation as a framework service.

More precisely, under this program, asynchronous computing capabilities were
added to pathos. Worker pools now provide asynchronous maps and pipes, as well
as iterative ordered and unordered asynchronous variants. New asynchronous
conditional parallel maps were added, which are both robust against failure and potentially
orders of magnitude faster than blocking maps. Conditional maps terminate when
the desired (potentially statistical) condition is met, as opposed to waiting for all
results to return. The klepto package was created to provide an abstraction for
storage and retrieval of objects in a database, in memory, or on disk. klepto provides
asynchronous and distributed parallel caching (as opposed to recalculation) and cache
interpolation strategies, and can be used to decouple the workflow and management
of ASGs that span distributed resources. klepto also provides hierarchical caching, so
for example a fast local cache could be maintained in memory with the most recently
used entries, while a centralized global database serves as a second tier for all entries
not interpolated or found in the fast local cache.
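
The two-tier lookup described above can be sketched with the standard library alone.
This is a minimal illustration of the idea, not klepto's API: the class name
`TwoTierCache`, its parameters, and the plain dict standing in for the centralized
database are all assumptions, and cache interpolation is not modelled.

```python
from collections import OrderedDict


class TwoTierCache:
    """Small in-memory LRU tier in front of a larger backing store.

    Hypothetical helper for illustration only (not part of klepto).
    """

    def __init__(self, compute, local_size=2, archive=None):
        self.compute = compute          # expensive function being cached
        self.local_size = local_size    # capacity of the fast local tier
        self.local = OrderedDict()      # most recently used entries
        # A plain dict stands in for the centralized global database.
        self.archive = {} if archive is None else archive
        self.misses = 0                 # number of actual recalculations

    def __call__(self, key):
        if key in self.local:           # tier 1: fast local cache
            self.local.move_to_end(key)
            return self.local[key]
        if key in self.archive:         # tier 2: global store
            value = self.archive[key]
        else:                           # found nowhere: compute and archive
            self.misses += 1
            value = self.compute(key)
            self.archive[key] = value
        self._remember(key, value)
        return value

    def _remember(self, key, value):
        """Insert into the local tier, evicting least recently used entries."""
        self.local[key] = value
        self.local.move_to_end(key)
        while len(self.local) > self.local_size:
            self.local.popitem(last=False)


if __name__ == "__main__":
    cached_square = TwoTierCache(lambda x: x * x, local_size=2)
    print([cached_square(x) for x in [1, 2, 3, 1, 2]])
    print("recalculations:", cached_square.misses)
```

Entries evicted from the local tier are still served from the archive, so the
underlying function runs only once per distinct key.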

The majority of mystic was converted to asynchronous computing, thus enabling
optimization to scale dramatically in size and complexity. Optimizers in mystic can
now proceed in a step-by-step iterative fashion, potentially saving state at each step.
This change enables mystic's optimizers to serve as a long-running daemon process
that dynamically responds to new information; essentially, optimizers have been
converted to provide a "streaming" or "event" mode, to tackle real-time updates of
information about the constraints or the cost function.
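
The step-by-step mode described above can be illustrated with a small sketch. It
does not use mystic's API: the class `StepwiseMinimizer`, its methods, and the choice
of plain gradient descent are assumptions made here to show how an optimizer can
advance one step at a time, save its state, and accept an updated cost function
while running.

```python
import json


class StepwiseMinimizer:
    """Gradient descent that advances one step per call (illustrative only)."""

    def __init__(self, cost_gradient, x0, rate=0.1):
        self.cost_gradient = cost_gradient  # callable: list[float] -> list[float]
        self.x = list(x0)
        self.rate = rate
        self.steps = 0

    def step(self):
        """Take a single descent step and return the new iterate."""
        g = self.cost_gradient(self.x)
        self.x = [xi - self.rate * gi for xi, gi in zip(self.x, g)]
        self.steps += 1
        return self.x

    def update_cost(self, cost_gradient):
        """React to new information by swapping in a new cost gradient."""
        self.cost_gradient = cost_gradient

    def save_state(self):
        """Serialize everything needed to resume later."""
        return json.dumps({"x": self.x, "steps": self.steps, "rate": self.rate})

    @classmethod
    def from_state(cls, cost_gradient, state):
        """Rebuild a minimizer from a string produced by save_state."""
        data = json.loads(state)
        solver = cls(cost_gradient, data["x"], data["rate"])
        solver.steps = data["steps"]
        return solver


def quadratic_gradient(target):
    """Gradient of sum((x_i - target)^2), a stand-in cost function."""
    return lambda x: [2.0 * (xi - target) for xi in x]


if __name__ == "__main__":
    solver = StepwiseMinimizer(quadratic_gradient(1.0), [0.0])
    for _ in range(100):
        solver.step()
    checkpoint = solver.save_state()
    # New information arrives: the minimum has moved to 3.0.
    solver = StepwiseMinimizer.from_state(quadratic_gradient(3.0), checkpoint)
    for _ in range(100):
        solver.step()
    print(solver.x, solver.steps)
```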

Given enough parallel resources, mystic's ensemble solvers demonstrate orders
of magnitude improvements in speed and accuracy over industry standard genetic
algorithms. With the addition of klepto, mystic's ensemble solvers were augmented
to provide N-dimensional global search capabilities. For example, parallel ensembles
of optimizers can be launched to search for all critical points and inflection points of
an unknown surface, terminating only after no further points are found. The resulting
points can then be fed into an N-dimensional interpolation engine, to produce a fast,
accurate surrogate model for the unknown surface.
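
The ensemble search loop described above can be sketched as follows. This is a
simplified stand-in rather than mystic's ensemble solvers: it works in one dimension,
finds only local minima (not all critical or inflection points), uses plain gradient
descent as the local optimizer, and stops heuristically once a whole batch of
restarts finds nothing new. All names and parameter values are assumptions.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def local_descent(start, grad, rate=0.01, steps=5000):
    """Plain gradient descent from one starting point."""
    x = start
    for _ in range(steps):
        x -= rate * grad(x)
    return x


def ensemble_search(grad, bounds, batch=8, tol=1e-3, seed=0):
    """Launch batches of local searches until a batch finds no new point."""
    rng = random.Random(seed)
    found = []
    with ThreadPoolExecutor() as pool:
        while True:
            starts = [rng.uniform(*bounds) for _ in range(batch)]
            results = pool.map(lambda s: local_descent(s, grad), starts)
            new = 0
            for x in results:
                # Keep x only if it is not a duplicate of a known point.
                if all(abs(x - y) > tol for y in found):
                    found.append(x)
                    new += 1
            if new == 0:
                return sorted(found)


def double_well_gradient(x):
    """Derivative of f(x) = (x**2 - 1)**2, which has minima at -1 and +1."""
    return 4.0 * x * (x * x - 1.0)


if __name__ == "__main__":
    print(ensemble_search(double_well_gradient, bounds=(-2.0, 2.0)))
```

The stopping rule gives no guarantee that every minimum has been found; larger
batches or several consecutive empty batches would make it more conservative. The
returned points could then be passed to an interpolation routine to build a
surrogate, a step this sketch omits.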

Broader impact of the work accomplished. H. Owhadi and C. Scovel have
been interviewed by HPCwire¹. The Bayesian Brittleness papers have generated
significant blog activity². Gamblets have been presented at a plenary lecture at SIAM
CSE 2015³. H. Owhadi is co-editing the "Handbook of Uncertainty Quantification"
(Springer) with R. Ghanem and D. Higdon. M. McKerns is editing a chapter in that
book (on software aspects). H. Owhadi has been invited (by Dr. Bruce Suter, DR-04
USAF AFMC AFRL/RITB) to AFRL, Rome NY to present and discuss the results of
[9]. Schlumberger is exploring the incorporation of the results of [9] into its subsurface
flows software. Gamblets have led to a provisional patent (number 62/130,374).

¹See www.hpcwire.com/2013/09/13/the_masters_of_uncertainty/

²See for instance http://errorstatistics.com/2015/01/08/on-the-brittleness-of-bayesian-inference-an-update-owhadi-and-scovel-guest-post/

³See https://www.pathlms.com/siam/courses/1043/sections/1259/thumbnail_video_presentations/9883